- Path: keats.ugrad.cs.ubc.ca!not-for-mail
- From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
- Newsgroups: comp.lang.c
- Subject: Re: Fast Fields?
- Date: 14 Feb 1996 18:40:14 -0800
- Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
- Message-ID: <4fu6eeINNm4a@keats.ugrad.cs.ubc.ca>
- References: <4fqj6v$i71@hermes.louisville.edu>
- NNTP-Posting-Host: keats.ugrad.cs.ubc.ca
-
- In article <4fqj6v$i71@hermes.louisville.edu>,
- Alan Wild <arwild01@homer.louisville.edu> wrote:
- >Actually I'm more concerned with convenience. I am currently coding a
- >small routine to parse a space delimited text file, but I only need to
- >keep field #6 of 8. I know I could simply declare temporary variables
- >to store the extra functions and fscanf my way through the file, but is
- >there an alternative?
- >
- >Basically I'm looking for some of the functionality of perl/awk in a
- >C routine so that I don't have to deal with dummy variables and whatnot.
- >
- >BTW, This project has to be done in C. This is only a small piece of
- >the whole puzzle, but an essential piece nonetheless.
- >
- >Any thoughts?
-
- Try learning how to use the ``lex'' lexical analyzer generator. Though you
- don't write a lex specification in C (well, not exactly), the resulting program
- that it writes for you is C.
-
- In a nutshell, you give lex a bunch of patterns to match associated with C
- snippets which it will perform when matches happen. Lex uses the spec to
- compile a fast, table-driven textfile snarfer that blows the doors off scanf()
- and friends for anything non-trivial.
-
- You won't quite get the high-level functionality of awk. The concept of
- pattern/action pairs may sound similar to awk, but it isn't, exactly. For one
- thing, only one pattern can match at a time (under normal operation).
- Secondly, the input is treated as a stream of characters, not lines. Patterns
- that are matched are extracted from the stream.
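
To make the pattern/action idea concrete, here is a rough sketch of what a
spec for the "field 6 of 8" job might look like. It is only an illustration
under a few assumptions: fields are separated by spaces or tabs, each record
ends with a newline, and the counter name `field` is made up for this example.
Note how the newline is just another pattern in the character stream, not a
built-in record boundary as in awk.

```lex
%{
#include <stdio.h>
static int field = 0;   /* number of fields seen so far on this line */
%}

%%
[^ \t\n]+   { if (++field == 6) printf("%s\n", yytext); }
[ \t]+      { /* separator between fields: discard */ }
\n          { field = 0;  /* next line starts counting again */ }
%%

int yywrap(void) { return 1; }   /* no further input files */

int main(void)
{
    yylex();                     /* scan stdin until end of file */
    return 0;
}
```

Something along the lines of `lex fields.l && cc lex.yy.c -o fields` should
build it (the file name is arbitrary); the program then reads standard input
and prints the sixth field of every line that has one.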
-
- I almost always use lex together with yacc, even for parsing data files that
- have trivial grammars.
-
-
- These tools may be overkill for what you are doing, but once you master lex,
- you will find yourself using it for even trivial jobs like extracting space
- separated fields from a data file. The lex spec is a lot easier to maintain
- than hand-written scanning code.
-
- If you want to do it quickly, it's probably best to avoid scanf. Buffer entire
- lines and scan through them using quick loops to look for spaces and extract
- what you want. Or build a state machine which uses fgetc() to read individual
- characters, and returns tokenized fields.
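
Both hand-written approaches can be sketched in a few lines of C. This is a
minimal illustration, not a tuned implementation: the names `get_field`,
`next_token` and the `TOK_*` constants are invented for the example, fields
are assumed to be separated by whitespace, and over-long fields are silently
truncated to fit the caller's buffer.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Line-buffer approach: copy whitespace-separated field number `want`
 * (1-based) of `line` into `out`, which holds `outsz` bytes.
 * Returns 1 if the field exists, 0 otherwise.
 */
int get_field(const char *line, int want, char *out, size_t outsz)
{
    const char *p = line;
    int field = 0;

    if (want < 1 || outsz == 0)
        return 0;

    for (;;) {
        const char *start;
        size_t len;

        /* Skip the whitespace in front of the next field. */
        while (*p != '\0' && isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            return 0;               /* ran out of fields */

        /* Scan to the end of this field. */
        start = p;
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;

        if (++field == want) {
            len = (size_t)(p - start);
            if (len >= outsz)
                len = outsz - 1;    /* truncate rather than overflow */
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }
    }
}

enum token { TOK_FIELD, TOK_EOL, TOK_EOF };

/*
 * State-machine approach: read one token from `fp` using fgetc().
 * A field is stored in `out`; a newline is reported as its own token.
 */
enum token next_token(FILE *fp, char *out, size_t outsz)
{
    size_t len = 0;
    int c;

    /* State 1: between fields. Skip blanks, report newlines. */
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            return TOK_EOL;
        if (!isspace(c))
            break;
    }
    if (c == EOF)
        return TOK_EOF;

    /* State 2: inside a field. Collect until whitespace or EOF. */
    do {
        if (len + 1 < outsz)
            out[len++] = (char)c;
    } while ((c = fgetc(fp)) != EOF && !isspace(c));

    if (c == '\n')
        ungetc(c, fp);              /* next call reports the end of line */
    if (outsz > 0)
        out[len] = '\0';
    return TOK_FIELD;
}
```

With the first routine, a loop that reads each line with fgets() and then
calls get_field(line, 6, buf, sizeof buf) pulls out the sixth field without
any dummy variables. With the second, the caller counts TOK_FIELD results
and resets the count on TOK_EOL.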
-